(54) Title: IN VITRO EVOLUTION OF ENZYME SPECIFICITY

(57) Abstract: This invention relates to the field of chemical synthesis and is particularly, though not exclusively, concerned with a 
method of producing active molecules, such as enzymes with desirable characteristics. In particular, the invention provides a method 
of producing new active molecules comprising at least 2 rounds of mutating a nucleic acid encoding an active molecule and selecting 
encoded active molecules which have activity against a substrate, wherein from round to round the substrate used differs from the 
previously used substrate by one or more minor differences. 
In vitro evolution of enzyme specificity

Field to which invention relates 

This invention relates to the field of chemical synthesis and is particularly, though not
exclusively, concerned with a method of producing active molecules, such as enzymes with 
desirable characteristics. 

Background to invention 

The primary focus of drug production in the pharmaceutical industry is on small-molecule
(MW <1000 Da) chemicals, as these compounds are often orally available (Buckland B. C.
et al. (2000) Metab. Eng. 2: 42-48). An increasing awareness of the differing activities
exhibited by different optical isomers of the same drug has led to increased pressure to
manufacture optically pure therapeutics. Syntheses of such pure compounds by chemical
methods often require several expensive protection and de-protection steps to achieve
selectivity. By contrast, enzymes achieve higher selectivity in a single step. However, the
use of enzymes in the synthesis of complex molecules is currently hindered by the time
taken to discover or develop an enzyme with the required substrate specificity, as compared
to optimising established chemical transformations. Even with an increasing number of
known enzymatic reactions, identifying a suitable biocatalyst is extremely difficult, as the
known enzymes often do not show activity towards the desired substrate.

Although forced evolution has shown great potential to enhance the properties of enzymes, 
particularly for hydrolases, enzymes derived by the method are still not widely used in the 
manufacture of pharmaceuticals. A major bottleneck is the identification of a suitable 
enzyme that is capable of the desired reaction. While databases such as ENZYME 
(Bairoch A. (2000), The ENZYME database in 2000. Nucleic Acids Research 28 
pp304-305) usually identify a suitable class of enzyme capable of the desired chemistry, it 
is much more difficult to find an example of that enzyme with activity towards a particular 
substrate, due to the high substrate specificity exhibited by most natural enzymes. The 
chemical route is therefore used in drug production rather than the enzymatic route despite 
the selectivity drawbacks and typically higher process costs of many chemical catalysts
(Thayer, A. M. (2001) Chemical & Engineering News 79: 27-34). 

To produce enzymes having improved properties suitable for use in chemical production,
mutant genes encoding the proteins have to be created and expressed. In the past, the only
way to obtain mutations was to isolate naturally occurring mutants with strain-screening
methods. However, the rate of natural mutation is very low, and mutants have to be
generated by treatment with mutagenic agents such as chemical mutagens or UV light.
Moreover, even if an accurate screening test was available, isolation of mutants that were
lethal or that did not produce observable changes was not possible (Zaccolo, M., et al.,
(1996) J. Mol. Biol., 255, 589-603).

Random mutagenesis is now usually achieved by PCR methods. These methods have
revolutionised the means by which mutants are obtained because they are more precise and
more efficient than phenotypic screening, yielding mutations in 50-100% of the proteins
created (Smith, M., (1985) Biochimie, 67, 717-723, & Zoller, M., (1991) Curr. Opin.
Biotech., 2, 526-531).

In contrast to natural evolution, directed evolution has a defined goal, which is to generate
large pools of molecular variants at the DNA level from which proteins with the desired
properties can be selected. As opposed to protein engineering by design, directed evolution
does not require prior knowledge of the protein structure (Manuela Zaccolo and Ermanno
Gherardi, (1999) J. Mol. Biol., 285, 775-783). Furthermore, it relies on the fact that
proteins can tolerate a number of amino acid residue substitutions without dramatic effect
on folding or stability (Axe et al., (1996) Proc. Natl. Acad. Sci. USA, 93, 5590-5594;
Bowie et al., (1996) Methods Enzymol., 266, 598-616). It also relies on the fact that natural
evolution has only screened a subset of potentially useful sequences, and therefore
unexplored sequence space can reveal better solutions to biological problems (Manuela
Zaccolo and Ermanno Gherardi, (1999) J. Mol. Biol., supra). In practice, the mutational
load, representing the number of mutations introduced into the gene, cannot exceed a certain
rate, and the size of mutant DNA libraries is limited to about 10^6 to 10^13 clones
depending on the mode of screening or selection.
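
The library-size limit can be made concrete with a short calculation. The following is an illustrative sketch only: the protein length of 300 residues is an assumed example and does not come from the text. It counts the distinct variants that differ from a parent protein at exactly k positions and finds the largest k for which all such variants would fit within a library of a given size.

```python
from math import comb


def variants_with_k_substitutions(length: int, k: int, alphabet: int = 20) -> int:
    """Count sequences differing from a parent at exactly k positions."""
    # Choose which k positions change, then one of (alphabet - 1) new residues at each.
    return comb(length, k) * (alphabet - 1) ** k


def max_screenable_substitutions(length: int, library_size: int) -> int:
    """Largest k such that every k-substitution variant fits in the library."""
    k = 0
    while k < length and variants_with_k_substitutions(length, k + 1) <= library_size:
        k += 1
    return k


# Assumed example: a 300-residue protein and the two library sizes quoted above.
for library_size in (10**6, 10**13):
    k = max_screenable_substitutions(300, library_size)
    print(f"library of {library_size:.0e} clones covers all variants with up to {k} substitution(s)")
```

Under these assumptions even the largest library covers only a handful of simultaneous substitutions exhaustively, which is why the mutational load has to be controlled.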
Random mutations have previously been introduced into a stretch of DNA using non-PCR
methods such as chemical mutagenesis, UV irradiation, mutator strains and poisoned
nucleotides (Kuchner and Arnold, (1997) Trends Biotechnol., 15, 523-530). These
methods are reported to be successful but they suffer from a number of disadvantages. 
First, the mutations can affect any gene in the organism's genome and the average number 
of mutated genes of interest can be very low. Second, prior to DNA cloning and 
sequencing technologies there was no way of knowing the location and the type of 
mutations being induced into the genome. 

There was therefore a pressing need for biologists to refine mutagenesis techniques and
combine them with PCR amplification methods in order to generate higher numbers of
mutants with more precise mutations. These methods have revolutionized the means by
which mutants are obtained (Smith, M. et al., (1985) Biochimie, 67, 717-723, & Zoller,
M.J., (1991) Curr. Opin. Biotech., 2 (4), 526-531).

Among the methods that introduce random mutations across the entire gene is error-prone PCR
(epPCR). The approach has been applied successfully to engineer new protein functions
and to improve catalytic activity. Another technique is combinatorial cassette
mutagenesis, which uses oligonucleotides containing randomised codons as mutagenic
cassettes for introduction into the gene of interest by PCR methods (Reidhaar-Olson
et al., (1999) Methods Enzymol., 208, 564-586). Moreover, a DNA shuffling method
developed for random in vitro DNA recombination represents a significant advance in the
applications of directed evolution methods (Stemmer et al., (1994) Nature, 370, 389-391).

A typical cycle of directed evolution starts with the selection of DNA sequences encoding
proteins that exhibit, to some extent, the sought-after property. The diversity of the
sequences is then increased through the mutagenesis step by introducing random point
nucleotide mutations and amplifying the DNA fragment using epPCR. These DNA
sequences are then cloned into an expression vector and transformed into competent E.
coli cells. A screening procedure is then employed to isolate the transformant E. coli cells
containing the mutated PCR-amplified DNA fragments encoding proteins with improved
characteristics.
The selected sequences are then amplified again, and the mutagenesis, amplification and
screening steps are repeated until proteins with the desired properties or
functions are obtained.

Remarkable success has been achieved in the industrial application of directed evolution
to improve the activities and thermostabilities of vaccines and pharmaceuticals
(Schmidt-Dannert & Arnold, (1999) Trends Biotechnol., 17, 135-136). These successful
applications have demonstrated the possibilities for future uses of directed evolution in
understanding protein functions and in the production of novel biocatalysts (Gregory L. Moore
and Costas D. Maranas, (2000) J. Theor. Biol., 205 (3), 483-503).

The technique of DNA shuffling creates gene libraries, containing combinations of 
mutations derived from a set of homologous DNA sequences or arising as a result of point 
mutations (Kuchner and Arnold, (1997) Trends Biotechnol., 15, 523-530). Recombination
serves to promote positive traits and to eliminate negative traits in the progeny, resulting in
a rapid accumulation of beneficial mutations in separate genes.

In the Error-prone PCR method, the gene of interest is amplified after many PCR cycles 
under conditions that increase normal mis-incorporation errors. The error prone PCR 
replication process (Cadwell & Joyce, (1994) PCR Meth Appl., 3(6), S136-S140) 
intentionally introduces copying errors by imposing mutagenic reaction conditions, based 
on the parameters below: 

1. The error rate (fidelity) of the Taq polymerase.

2. The length of the mutagenised gene and the number of effective doubling cycles.

3. The concentration of MnCl2 affecting the rate of mutations induced by the Taq
polymerase.
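
How these parameters combine can be sketched numerically. The model below is a simplifying assumption rather than a formula from the text: it treats misincorporations as independent events, so that the number of mutations per gene copy is approximately Poisson distributed with mean equal to error rate x gene length x number of doublings. The effect of the manganese concentration is not modelled separately; it is assumed to be folded into the effective error rate supplied by the user.

```python
from math import exp, factorial


def expected_mutations(error_rate: float, gene_length: int, doublings: float) -> float:
    """Mean number of mutations per gene copy after epPCR (assumed independent errors)."""
    return error_rate * gene_length * doublings


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a gene copy carries exactly k mutations."""
    return exp(-mean) * mean**k / factorial(k)


def fraction_mutated(mean: float) -> float:
    """Fraction of gene copies carrying at least one mutation."""
    return 1.0 - exp(-mean)


# Assumed example values: 1e-4 errors per nucleotide per doubling, 1000 bp, 10 doublings.
load = expected_mutations(1e-4, 1000, 10)
print(f"mean mutations per gene: {load:.2f}")
print(f"fraction of copies mutated: {fraction_mutated(load):.2f}")
```

Raising the effective error rate or the number of doublings increases the mutational load proportionally in this simple model.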
The first step of PCR is the denaturation of the DNA into two single strands. The second step
is the annealing of a primer to the DNA single strands. The third step is the extension by
Taq polymerase.

Nucleotides complementary to the single-strand template are added, using the original
sequence as a template, extending the complementary strands until the normal DNA double
strands are recovered. Most mutations occur in this step, where non-complementary
nucleotides are incorporated into the chain. The mutation rates induced by Taq range from
10^-7 up to 10^-3 per nucleotide polymerized, as reported by Eckert & Kunkel, (1990) Nucleic
Acids Research, 18, 3739-3744. However, these mutations are nucleotide dependent
(Cadwell & Joyce, (supra); Shafikhani et al., (1997) Biotechniques, 23, 304-310).
Therefore the monitoring of these variable replication errors is crucial for the mutagenesis.

Examples of applications of epPCR already in use include improved solvent stability and
thermostability and enhanced specific activity of enzymes and proteins (Chen and Arnold,
(1993) Proc. Natl. Acad. Sci. USA, 90 (12), 5618-5622; Giordano et al., (1999)
Biochemistry-US, 38 (10), 3043-3054; Henke and Bornscheuer, (1999) Biol. Chem., 380
(7-8), 1029-1033; Moore and Arnold, (1996) Nat. Biotechnol., 14, 458-467; Shibata et al.,
(1998) Protein Eng., 11 (6), 467-472; You and Arnold, (1996) Protein Eng., 9 (1), 77-83).
Therefore specific protein activities and functions that never occur in nature can be easily
generated.

The method of epPCR is subject to several disadvantages. First, there is a tendency for
neutral and deleterious mutations to accumulate in the selected progeny sequences,
increasing with the number of cycles of epPCR. Second, the random mutagenesis is
directed across the entire gene sequence in the hope that a mutation resulting in the
desired improvement will be found within the generated library. As a result, most mutants
that are formed are non-functional (deleterious) or have no effect on the desired property
(neutral). Given the restrictions on the library size that can be searched in a practical manner,
these redundant mutations further limit the useful sequence space that can be searched.



Combinatorial cassette mutagenesis is sometimes used to mutate selected sequences and 
therefore reduces the sampling of redundant sequence space. However, the level of 
mutation can no longer be varied using the method, unless new DNA primers are
synthesized for each mutational load required. Control of mutational load is desired in
order to control the generated sequence space, and hence the library size, to within a
practical limit. It is also extremely difficult to generate combinatorial cassettes that contain
only non-disruptive mutations, i.e. encoding only amino acids with similar 
physico-chemical properties to that of the original wild-type amino acid. Altering an amino 
acid in a protein sequence frequently disrupts the structure or function of the protein, 
resulting in the need to search again through redundant sequence space. 

Our copending PCT patent application WO 03/004595, filed 5th July 2002, discloses
an improved method of epPCR which allows hyper-mutation by "focused
error-prone PCR" at a specific and selected active site of a nucleic acid or polypeptide.
The approach is analogous to phage-display libraries and the natural repertoire of 
antibodies, in that only the active site of the displayed protein is randomised. 
Phage-display has been used successfully to obtain tighter ligand binding, and also to 
obtain "catalysis" of a reaction. However, phage-display relies on the linkage between 
catalysis and a binding property in order to select variants from a large library, thus limiting
its potential to the selection of single turnover events and not true catalysis. Focused 
error-prone PCR is a hybrid approach that combines a direct assay for nucleic acid or 
polypeptide function and the focused sequence randomization of phage display. 



Focused epPCR is a novel method based on conventional epPCR in which a nucleic acid
fragment is amplified by PCR using Taq polymerase. Taq polymerase has a low fidelity of
replication due to a lack or reduction of 3'-5' exonuclease proof-reading activity. The rate
of mutagenesis, and hence the mutational load, can be altered by varying the concentration
of Mn2+ in the PCR reaction according to previously established equations (Fromant et al.,
(1995) Anal. Biochem., 224, 347-353). During amplification, the sequence between the
two PCR primers becomes mutated at random and the sequence of the primers can also
become mutated at random, although generally to a lesser degree than the sequence
between the primers.



WO 2004/024918 PCT/GB2003/003958 

Focused epPCR takes advantage of the fact that the majority of residues important for
nucleic acid function (e.g. promoter or enhancer activity) or polypeptide function (e.g.
catalysis or substrate binding) make up only a small proportion of the entire nucleic acid
sequence or protein in most cases (Clackson et al., (1998) J. Mol. Biol., 277, 1111-1128).
The primers for PCR are thus chosen to complement the sequence at either side of these 
short and specific regions to be randomised. The primers may complement the sequences 
within the short and specific regions or may complement sequences just outside the short 
specific regions to be amplified. The result is that only those regions of the nucleic acid or 
polypeptide comprising the active site are randomized. 

In particular, the copending application discloses a method of randomly modifying a
specific region of a functional nucleic acid sequence or polypeptide sequence while 
maintaining the remaining sequence so as to arrive at a functional nucleic acid or 
polypeptide with improved characteristics. 

Specifically, the copending application discloses a method of producing a modified
polypeptide with improved characteristics comprising the steps of:

(a) obtaining nucleic acid primers which flank an active site within a parent nucleic acid
sequence encoding a parent polypeptide;

(b) carrying out a polymerase chain reaction (PCR) using said primers and the parent
nucleic acid sequence as a template under suitable conditions for introducing mutations
into the amplified active site sequence;

(c) isolating said mutated active site;

(d) introducing said mutated active site into the parent nucleic acid sequence to replace the
non-mutated active site thereby producing a modified nucleic acid sequence, or introducing
said mutated active site into a template nucleic acid sequence to produce a modified
nucleic acid; and

(e) expressing said modified nucleic acid sequence to produce a modified polypeptide.
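
The idea of confining mutations to a primer-flanked region and then returning that region to the parent sequence, as in steps (a) to (d), can be illustrated with a toy in-silico model. This is a hedged sketch, not the laboratory protocol: the function name `mutate_region`, the per-base substitution rate and the example sequence are all assumptions made for illustration, and the possibility that the primer sequences themselves mutate is ignored.

```python
import random

BASES = "ACGT"


def mutate_region(parent: str, start: int, end: int, rate: float, rng: random.Random) -> str:
    """Return a copy of parent in which only parent[start:end] receives random substitutions.

    start and end stand in for the positions bounded by the flanking primers.
    """
    region = []
    for base in parent[start:end]:
        if rng.random() < rate:
            # Substitute with one of the three other bases.
            base = rng.choice([b for b in BASES if b != base])
        region.append(base)
    # Splice the mutated region back into the otherwise unchanged parent sequence.
    return parent[:start] + "".join(region) + parent[end:]


rng = random.Random(0)          # fixed seed so the sketch is reproducible
parent = "ACGT" * 10            # hypothetical 40-base parent sequence
library = [mutate_region(parent, 12, 24, 0.2, rng) for _ in range(5)]
for variant in library:
    print(variant)
```

Every variant produced this way is identical to the parent outside positions 12 to 23, so the library explores only the chosen region.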



Forced evolution has previously been used to alter the specificity of an enzyme towards
substrates that were already poorly accepted by the wild-type enzyme (Arnold, F. H. et al.
(1999) Curr. Opin. Chem. Biol. 3: 54-59; May, O. et al. (2000) Nat. Biotechnol. 18:
317-320). Ellington and co-workers have since demonstrated that substrate specificity
initially broadens as an enzyme is evolved for improved activity towards poorly accepted
substrates (Matsumura, I. et al. (2001) J. Mol. Biol. 305: 331-339). Specifically, the
Escherichia coli β-glucuronidase (GUS) was evolved through three rounds of DNA
shuffling and screening in vitro to catalyze the hydrolysis of a β-galactoside substrate 500
times more efficiently (kcat/Km) than the wild-type GUS, which has only weak
β-galactosidase activity, with a 52 million-fold inversion in specificity. The kinetic behaviour
of the purified mutant proteins in reactions with a series of substrate analogues showed that
certain mutations account for the changes in substrate specificity, and that they were
synergistic. They noted that, during a forced evolution experiment, a second evolutionary
intermediate of GUS, unlike the wild-type and evolved forms, exhibited broadened
specificity for the substrates pNP-fucoside and oNP-galactoside-6-phosphate, which are
dissimilar to either glucuronides or galactosides and on which the wild-type GUS showed no
detectable activity. These results were indicated as being consistent with the "patchwork"
hypothesis, which postulates that modern enzymes diverged from ancestors with broad
specificity. The minor changes in substrate are illustrated in the accompanying drawing Fig 1.
In other words, previously undetectable activity towards novel substrates was obtained
unintentionally during forced evolution, where the new substrate differed by a few small
chemical structure changes that could be accepted by the GUS enzyme after a single
mutation (N566S) in one round of random or directed mutagenesis. In one aspect of the
present invention, the substrate specificity of an enzyme is intentionally changed to act on a
desired substrate where the wild-type enzyme shows substantially no previously detectable
activity. By repeating this several times on sequential substrates, an enzyme's substrate
specificity can be directed along an "evolutionary pathway" comprising stepwise
modifications of the substrate structure. This can then be repeated with each new enzyme
on substrates that are increasingly distant in structural and chemical space from the original
substrate. By making stepwise modifications of the substrate it is possible to generate an
enzyme which has activity against a desired substrate that differs from the substrate of the
wild-type enzyme to such a degree that it would be substantially impossible to generate an
enzyme having activity against the desired substrate by randomly mutating the wild-type
enzyme and selecting for activity against the desired substrate.



It is an object of the invention to provide an efficient method of developing active 
molecules, particularly enzymes with improved properties and, in particular, specificity for 
a substrate of commercial interest.



Disclosure of the invention 

According to one aspect of the invention there is provided a method of producing new
active molecules, the method comprising:

i) a first round comprising mutating a nucleic acid encoding a starting active molecule,
which has activity against a starting substrate, to produce one or more second active
molecules and detecting activity of the one or more second active molecules against a
second substrate on which the starting active molecule has substantially no activity, and
selecting one or more second active molecules which have activity against the second
substrate; and

ii) a subsequent round comprising mutating a nucleic acid encoding the one or more
selected second active molecules to produce one or more third active molecules and
detecting the activity of one or more third active molecules against a third substrate on
which the second active molecule has substantially no activity, and selecting one or more
third active molecules which have activity against the third substrate,

wherein the third substrate is sufficiently different in structure from the starting substrate so
that the starting active molecule will have no activity against the third substrate and it
would be substantially impossible to obtain an active molecule having activity against the
third substrate by performing a single round of random mutagenesis on the starting active
molecule.

The basis of the invention is that active molecules against the final substrate (e.g. the third
substrate) can only be obtained by using one or more intermediate substrates. The
distance between the structure (and chemical space) of the starting substrate and the final
substrate is so large that it would be practically impossible to obtain an active molecule
having activity against the final substrate by performing a single round of random
mutagenesis using current methods. The degree of mutagenesis of the active molecule
required would be so great, and the number of mutants generated consequently so large, that
it would be practically impossible to screen them in an efficient manner.

The term "a single round of directed or random mutagenesis" means that the starting active 
molecule can be randomly mutated using any current random mutagenic technique,
including directed evolution, DNA shuffling, etc., and that the mutated active molecules 
produced are only tested for activity against the final substrate. 

The present method provides a more efficient way of obtaining active molecules
having activity against a desired substrate, because a series of substrates bridging the
structural gap between the starting substrate and the final substrate is used.
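
The round-by-round use of bridging substrates can be sketched as a simple loop. This is a schematic illustration under stated assumptions, not an implementation of the laboratory method: `mutate` and `assay` are hypothetical callables standing in for the mutagenesis step and the activity measurement, and `threshold` is an assumed cut-off for calling a variant active.

```python
from typing import Callable, Iterable, Sequence, TypeVar

M = TypeVar("M")  # an active molecule, e.g. a gene sequence (hypothetical type)
S = TypeVar("S")  # a substrate (hypothetical type)


def evolve_along_pathway(
    start: M,
    substrates: Sequence[S],
    mutate: Callable[[M], Iterable[M]],
    assay: Callable[[M, S], float],
    threshold: float,
) -> M:
    """Run one round of mutation and selection per substrate, in order."""
    current = start
    for round_number, substrate in enumerate(substrates, start=1):
        # Mutate the molecule selected in the previous round and score each variant.
        scored = [(assay(variant, substrate), variant) for variant in mutate(current)]
        if not scored:
            raise RuntimeError(f"round {round_number}: no variants were produced")
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score < threshold:
            raise RuntimeError(f"round {round_number}: no variant active on {substrate!r}")
        current = best
    return current


# Toy model: molecules and substrates are integers, and a molecule is active
# only on the substrate with the same value. One round can move one step.
toy_mutate = lambda m: (m - 1, m + 1)
toy_assay = lambda m, s: 1.0 if m == s else 0.0
print(evolve_along_pathway(0, [1, 2, 3], toy_mutate, toy_assay, 0.5))
```

In the toy model, asking directly for activity on substrate 3 from molecule 0 (a pathway of `[3]`) raises an error, whereas the stepwise pathway `[1, 2, 3]` succeeds, which mirrors the reasoning behind using intermediate substrates.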

The active molecule can be any active molecule encoded by a nucleic acid which has a 
detectable activity against a substrate. The active molecule may be DNA, RNA or protein 
including peptides. Preferably, the active molecule is an enzyme, a receptor, an antibody 
molecule, an antigen, a ligand for a receptor or a substrate for an enzyme. Suitable 
receptors include cytokine receptors, ion channels, etc. Suitable ligands include cytokines
and hormones. Most preferably, the starting active molecule will be an enzyme. 

The substrate may be any substrate on which the active molecule has a detectable activity. 
The substrate may be a protein, a nucleic acid, a carbohydrate, an imprinted polymer, an 
organic or inorganic chemical compound. For example, when the active molecule is an 
enzyme, the substrate is an enzyme substrate; when the active molecule is a receptor, the 
substrate is a receptor ligand; when the active molecule is a receptor ligand, the substrate is 
a receptor; when the active molecule is an antibody, the substrate is an antigen, etc. 
Preferably the substrate is an enzyme substrate. 





The activity can be any detectable activity, such as an enzymatic activity, which can be 
measured by detecting the product or by-product of the enzymatic reaction. The detectable 
activity may be the binding of the active molecule to the substrate, which can be measured 
by detecting the bound complex or the amount of free active molecule or substrate. It is 
particularly preferred that the active molecules identified as having activity against the 
substrate have greater activity against the substrate than the previous active molecule. The 
active molecules preferably have at least 2 times, more preferably at least 10 times and
most preferably at least 100 times as much activity against the substrate as the previous 
active molecule. 

Therefore the invention provides a method for forcing the evolution of active molecules 
such as enzymes which use different substrates. The invention provides the foundation of a
technology by which active molecule variants with the desired substrate specificity can be 
identified while preserving the high selectivity that the active molecules typically achieve. 
The method of the invention would greatly reduce the time taken to develop a suitable 
biocatalytic route for pharmaceuticals, agrochemicals and selected fine chemicals. 

Basically, the invention provides a method of producing new active molecules comprising
at least 2 rounds of mutating a nucleic acid encoding an active molecule and selecting 
encoded active molecules which have activity against a substrate, wherein from round to 
round the substrate used differs from the previously used substrate by one or more minor 
structural or chemical differences. 

Although the mutation step can be carried out using any suitable method known to the
skilled worker, including the random mutagenesis processes described above, according to 
a preferred aspect of the invention, the mutation of the nucleic acid is performed using the 
focused error-prone PCR technique described above. Specifically, this aspect of the 
invention provides a method of producing new active molecules, the method comprising: 

i) a first round comprising mutating a nucleic acid encoding a starting active molecule,
which has activity against a starting substrate, to produce one or more second active 
molecules and detecting activity of the one or more second active molecules against a 
second substrate on which the starting active molecule has substantially no activity, and 
selecting one or more second active molecules which have substantial activity against the 
second substrate; and 

ii) a subsequent round comprising mutating a nucleic acid encoding the one or more 
selected second active molecules to produce one or more third active molecules and 
detecting the activity of one or more third active molecules against a third substrate on 
which the second active molecule has substantially no activity, and selecting one or more
third active molecules which have substantial activity against the third substrate, 

in which the mutation steps in one or more such rounds are conducted by 

(a) obtaining nucleic acid primers which flank an active site within a parent nucleic acid
sequence encoding a parent active molecule; 

(b) carrying out a polymerase chain reaction (PCR) using said primers and the parent 
nucleic acid sequence as a template under suitable conditions for introducing mutations 
into the amplified active site sequence; 

(c) isolating said mutated active site; 

(d) introducing said mutated active site into the parent nucleic acid sequence to replace the 
non-mutated active site thereby producing a modified nucleic acid sequence, or introducing 
said mutated active site into a template nucleic acid sequence to produce a modified 
nucleic acid; and 

(e) expressing said modified nucleic acid sequence to produce a modified active molecule, 

wherein the third substrate is sufficiently different in structure from the starting substrate so 
that the starting active molecule will have no activity against the third substrate and it 
would be substantially impossible to obtain an active molecule having activity against the
third substrate by performing a single round of random mutagenesis on the starting active 
molecule. 

The method of the invention may comprise further rounds of mutation involving further 
substrates and active molecules and testing the activity of the active molecules produced 
during each round sufficient to provide active molecules which act on substrates to effect
reactions which are of significant commercial value. The method of the invention may 
involve any number of rounds of mutation and testing. Typically, it may involve 2 to 10 
rounds. The number of rounds required will depend on the difference in structure between 
the starting substrate and the final substrate. 
Typically the third and subsequent active molecules will be produced as part of a selection
of mutated active molecules which are then selected on the basis of their ability to act on a 
specified substrate. In particular, when the active molecules are enzymes they are selected 
on the basis of their ability to catalyse a reaction of interest on a specified enzyme 
substrate. 

The invention requires that the previous active molecule (e.g. the starting active molecule
or the second active molecule) has substantially no activity on the new substrate (e.g. the
second substrate or the third substrate, respectively). The term "substantially no activity"
as used herein means that the previous active molecule may be active against the new
substrate at a level of 0 to 50%, preferably 0 to 20%, more preferably 0 to 10%, and most
preferably 0 to 5% of its activity against the previous substrate. As a specific example, the
second active molecule may be active against the third substrate at a level of 0 to 50% of
its activity against the second substrate.
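
The percentage bands that define "substantially no activity" can be expressed as a small check. The sketch below is illustrative only; the function names, the band labels and the example activity values are assumptions, and activities are taken to be measured in the same arbitrary units on both substrates.

```python
def relative_activity(on_new_substrate: float, on_previous_substrate: float) -> float:
    """Activity on the new substrate as a fraction of activity on the previous one."""
    if on_previous_substrate <= 0:
        raise ValueError("the previous active molecule must be active on its own substrate")
    return on_new_substrate / on_previous_substrate


# Upper limits of the bands given in the text, tightest first.
BANDS = (
    (0.05, "most preferred (0 to 5%)"),
    (0.10, "more preferred (0 to 10%)"),
    (0.20, "preferred (0 to 20%)"),
    (0.50, "substantially no activity (0 to 50%)"),
)


def classify(on_new_substrate: float, on_previous_substrate: float) -> str:
    """Return the tightest band that the relative activity falls into."""
    ratio = relative_activity(on_new_substrate, on_previous_substrate)
    for limit, label in BANDS:
        if ratio <= limit:
            return label
    return "outside the definition (more than 50%)"


# Assumed example: activity of 100 units on the previous substrate.
for new_activity in (3.0, 30.0, 60.0):
    print(new_activity, "->", classify(new_activity, 100.0))
```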

It is indicated that the starting active molecule will have no activity against the third
substrate. This means that the starting active molecule will have no detectable activity
against the third or subsequent substrate. It is also preferred that any active molecule will
have no detectable activity against a substrate used 2 or more rounds later in the method of
the present invention. Such subsequent substrates differ in structure from the substrate on
which the active molecule does have activity to such a degree that the active molecule will
not have any detectable activity against the subsequent substrates.

The new active molecule must have activity against the new substrate but may also have 
activity against one or more of the previous substrates. For example, the third active 
molecule must have a detectable activity against the third substrate and may have activity 
against the second substrate. In some situations it may be desirable for the new active 
molecule to have activity against both the new substrate as well as one or more of the 
previous substrates. The method of the present invention can therefore be easily modified
by testing the new active molecule on the new substrate as well as on one or more of the 
previous substrates so that active molecules with the desired activity can be selected. By 
using this method active molecules with an increased range of substrates can be obtained. 
Alternatively, the new active molecule can be selected so that it does not have activity
against the previous substrates or only has minimal activity against the previous substrates,
for example less than 10%, more preferably less than 5% of the activity of the new active
molecule against the new substrate. Using this method an active molecule with a very 
narrow range of substrates can be obtained. 
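
The two selection strategies, keeping variants with a broadened substrate range or keeping only variants with a narrow range, can be sketched as filters over measured activities. This is a hedged illustration: the data layout (a dictionary of variant names to per-substrate activities), the function names and the `min_activity` cut-off are assumptions, while the default 10% limit for the narrow case follows the figure given in the text.

```python
from typing import Dict, List, Sequence

# variant name -> substrate name -> measured activity (arbitrary units)
Activities = Dict[str, Dict[str, float]]


def select_broad(activities: Activities, new: str, previous: Sequence[str],
                 min_activity: float) -> List[str]:
    """Keep variants active on the new substrate and on at least one previous substrate."""
    selected = []
    for variant, measured in activities.items():
        on_new = measured.get(new, 0.0)
        if on_new >= min_activity and any(measured.get(s, 0.0) >= min_activity for s in previous):
            selected.append(variant)
    return selected


def select_narrow(activities: Activities, new: str, previous: Sequence[str],
                  min_activity: float, max_fraction: float = 0.10) -> List[str]:
    """Keep variants whose activity on every previous substrate is below
    max_fraction of their activity on the new substrate."""
    selected = []
    for variant, measured in activities.items():
        on_new = measured.get(new, 0.0)
        if on_new < min_activity:
            continue
        if all(measured.get(s, 0.0) < max_fraction * on_new for s in previous):
            selected.append(variant)
    return selected


# Hypothetical measurements for three variants on substrates S2 (previous) and S3 (new).
measurements = {
    "v1": {"S2": 8.0, "S3": 10.0},
    "v2": {"S2": 0.4, "S3": 10.0},
    "v3": {"S2": 9.0, "S3": 0.1},
}
print(select_broad(measurements, "S3", ["S2"], 1.0))
print(select_narrow(measurements, "S3", ["S2"], 1.0))
```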

As indicated above, the active molecule is preferably an enzyme. Enzymes for use as a 
starting point in the method of the invention, the first enzyme, can be readily selected by 
the skilled addressee for example from databases such as ENZYME. One suitable type of 
enzyme is the transketolases. Other enzymes can be selected from the other classes of 
enzymes, i.e. the oxidoreductases, transferases, hydrolases, lyases, isomerases, and ligases.


As indicated above, the invention can also be extended to ligand binding to any protein 
scaffold. 

The substrates used in the invention may be selected by the skilled addressee. Preferably 
each substrate is chosen to have one or more minor differences over the previous substrate. 
For example, and where the substrates are chemical compounds, the two substrates may 
differ by the substitution of a hydrogen with a methyl group or the substitution of a methyl 
group by a hydroxyl group, the removal or addition of a chemical group (e.g. RCH3 to
RCH2CH3 or the opposite), inversion or addition of chemical groups around a central atom
(e.g. S enantiomer to R enantiomer), oxidation or reduction of a chemical group (e.g. NO2 to
NH2 or the reverse), change in bonding by oxidation or reduction (e.g. single bond to double
or triple bond, alkane to alkene or alkyne; or the reverse), etc.

Where the substrates are proteins, the two substrates may differ by a few amino acids (e.g.
1 to 10 amino acids). The modifications made will depend on the activity being sought. For
example, modifications can be made to amino acids in a protein to alter the structure of the
protein, or to alter the active site of the protein. Preferably the few amino acids that
differ interact directly with the active molecule. Those skilled in the art could determine
what modifications to make in order to achieve the substrate having the desired function
and/or structure. Furthermore, standard techniques for altering the amino acids of a protein
are well known to those skilled in the art. Where the substrates are nucleic acids, the two
substrates may differ by a few nucleotides (e.g. 1 to 10 nucleotides). As indicated above
with respect to the protein substrates, the modification made will depend on the activity
being sought. Standard techniques for altering the nucleotide sequence of a nucleic acid are
well known to those skilled in the art.

The substrates used in the invention are preferably a series of intermediates wherein 
stepwise modifications of the substrate's structure are made. The starting active molecule
is active against a starting substrate and the new active molecule produced by the method 
of the present invention is active against a desired substrate. Each subsequent substrate 
used in the method of the invention is modified to be closer in structure to the desired 
substrate than the previous substrate. Accordingly, each substrate used in the method can 
be seen as a stepping stone from the starting substrate to the desired substrate. By using
such a series of substrates, the "evolution" of the active molecule is directed toward the
desired substrate so that an active molecule having activity against the desired substrate is 
obtained. 
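The stepping-stone procedure can be sketched as a loop over the substrate series. This is an illustrative outline under stated assumptions, not the method as practised: `evolve_along_series`, `mutate`, `assay` and `threshold` are hypothetical names, and the toy model at the end (integer "pocket sizes", one unit of change per round) exists only to make the sketch runnable.

```python
def evolve_along_series(start, substrates, mutate, assay, threshold=0.0):
    """Walk an active molecule along a series of stepping-stone substrates.

    start: the starting active molecule (any representation).
    substrates: ordered series, each closer in structure to the desired substrate.
    mutate: hypothetical callable, parent -> iterable of variants (one round).
    assay: hypothetical callable, (variant, substrate) -> measured activity.
    threshold: activity above which a variant counts as active.
    Returns the parent selected at the end of each round.
    """
    parent = start
    lineage = []
    for substrate in substrates:
        scored = [(assay(v, substrate), v) for v in mutate(parent)]
        active = [(score, v) for score, v in scored if score > threshold]
        if not active:
            raise RuntimeError(f"no active variant found for substrate {substrate!r}")
        # Carry the most active variant forward as the parent of the next round.
        parent = max(active, key=lambda pair: pair[0])[1]
        lineage.append(parent)
    return lineage


# Toy model (purely illustrative): a molecule is an integer "pocket size",
# one round of mutagenesis changes it by at most one unit, and a variant is
# active only on the substrate whose size matches it exactly.
def toy_mutate(parent):
    return [parent - 1, parent, parent + 1]


def toy_assay(variant, substrate):
    return 1.0 if variant == substrate else 0.0


# Stepwise series 1 -> 2 -> 3 starting from a molecule of size 0.
lineage = evolve_along_series(0, [1, 2, 3], toy_mutate, toy_assay)

# Attempting the final substrate directly in a single round.
try:
    evolve_along_series(0, [3], toy_mutate, toy_assay)
    single_round_reached_target = True
except RuntimeError:
    single_round_reached_target = False
```

In this toy model the stepwise series reaches the final substrate, whereas a single round applied directly to the final substrate finds no active variant.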

The desired or final substrate is so different from the starting substrate that it is 
substantially impossible to obtain an active molecule having activity against the desired 
substrate by performing a single round of random mutagenesis. Only by using a series of 
substrates that differ from each other by relatively minor differences is it possible to 
efficiently obtain an active molecule having activity against the desired substrate. 

It is preferred that the differences between each subsequent substrate are relatively small so 
that active molecules having activity against the new substrate can be obtained. Preferably,
each subsequent substrate differs by less than 20%, more preferably less than 10%, from
the previous substrate. Where the substrate is a chemical compound, it is preferred that less 
than 20%, more preferably less than 10% of the substituent groups are changed. Where 
the substrate is a protein it is preferred that less than 20%, more preferably less than 10% of 
the amino acids are changed. Where the substrate is a nucleic acid, it is preferred that less 
than 20%, more preferably less than 10% of the nucleotides are changed.
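For protein or nucleic acid substrates, the percentage criterion can be checked by counting changed positions. The sketch below assumes the two substrates have equal length and differ by substitutions only; the function names and the example sequences are hypothetical and are not taken from the specification.

```python
def fraction_changed(previous, new):
    """Fraction of positions that differ between two equal-length sequences."""
    if len(previous) != len(new):
        raise ValueError("this sketch only handles substitutions (equal lengths)")
    if not previous:
        raise ValueError("sequences must be non-empty")
    changed = sum(1 for a, b in zip(previous, new) if a != b)
    return changed / len(previous)


def step_size(previous, new):
    """Grade a step against the limits of less than 20% and less than 10%."""
    frac = fraction_changed(previous, new)
    if frac < 0.10:
        return "more preferred"
    if frac < 0.20:
        return "preferred"
    return "too large"


# Hypothetical nucleotide substrates: 1 of 20 positions changed (5%).
small_step = step_size("ACGTACGTACACGTACGTAC", "ACGTACGTACACGTACGTAA")
# Hypothetical nucleotide substrates: 3 of 10 positions changed (30%).
large_step = step_size("ACGTACGTAC", "TCGAACGTAA")
```

Insertions and deletions would need an alignment step first, which this sketch deliberately leaves out.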

The new active molecules produced by the method of the present invention can be used to 
perform a reaction of interest. The reaction of interest may be the binding of a specific
ligand or receptor, the digestion of a carbohydrate or a particular peptide or nucleotide
sequence, protecting a particular substrate from degradation, etc. Particularly preferred
reactions of interest include the isolation of optical isomers of compounds that have
therapeutic value. Other preferred reactions of interest include reactions that produce
compounds that are high-value intermediates to pharmaceuticals, fine chemicals or
agrochemicals; reactions that degrade toxic compounds (bioremediation); and reactions
towards analytes, e.g. using enzymes in diagnostics or biosensors.


Once an enzyme has been produced which is able to catalyse a reaction of commercial 
interest it can be further modified to improve its properties such as thermal stability, kinetic 
activities, etc. 

The method of the invention may be automated. 


According to another aspect of the invention there is provided an active molecule such as 
an enzyme obtained or obtainable by a method in accordance with the invention. 
Preferably the enzyme is obtained by the method of the present invention. 

According to a further aspect of the invention there is provided a method of producing a
compound of interest, the method involving the use of an active molecule, such as an 
enzyme, produced by the method according to the present invention in the production of 
the compound. 

Brief Description of the Drawings 


Methods in accordance with the invention will now be described, by way of example only,
with reference to the further accompanying drawings, Figures 2 to 3, in which:

Fig 2 shows a series of substrates that can be used in a method according to the invention; 
and 

Fig 3 shows an assayable reaction scheme for wild-type transketolase.



Examples

Forced evolution of transketolase and selection of modified enzymes

Transketolase is an important enzyme for the synthesis of asymmetric C-C bonds with a
new chiral carbon centre with up to 100% selectivity. Transketolase thus has potential use
in the synthesis of compounds such as novel sugars, amino acids, peptides and polyketides
for new antibiotics, antihypertensive vasopeptidase inhibitors, HIV protease inhibitors,
antiviral medication, treatments for rheumatoid arthritis, and antitumor compounds (Krix, G.
et al (1997) Journal of Biotechnology 53: 29-39; Szarka, L. et al (1999) Bioorganic and
Medicinal Chemistry 7: 2247-2252; Bommarius, A. et al (1998) Journal of Molecular
Catalysis B: Enzymatic 5: 1-11). The substrate specificity of transketolase has previously
been altered only by site-directed mutagenesis studies (J. Ward (UCL), unpublished), and
only for the aldehyde acceptor substrate. The structure of the transketolase enzyme is available
(Littlechild, J. A. et al (1995) Acta Crystallographica Section D-Biological
Crystallography 51: 1074-1076). The donor substrate binds deep into the transketolase
active site to form many specificity-determining contacts. This makes transketolase a good
test system for demonstrating the method of this invention, permitting the expansion of the
substrate specificity for the donor ketol substrate, which is currently limited to sugars and
β-hydroxypyruvate.

The series of substrates shown in Figure 1 is an example of such a series that can be used,
though there are many other commercial compounds that can extend this series as
necessary. Forced evolution techniques such as error-prone PCR, mutator strains and DNA
shuffling are preferably used to alter the specificity of transketolase at each 'step' of the
substrate series.

The first step in the series to be explored is a change in polarity from a hydroxyl to a 
methyl group, without a significant change in size. Evolution of the transketolase enzyme 
is likely to produce a corresponding change from a hydrogen-bonding residue in the 
enzyme structure to a hydrophobic one. The hydroxyl group of the β-hydroxypyruvate has
no direct influence on the mechanism of transketolase, although its presence may slightly
diminish activation of the ketone. The following illustrative steps, changing the substrate
from 2-ketobutyric acid to 2-ketovaleric acid and then to 2-ketoisocaproic acid, involve an
increase in the size and branching of the hydrophobic side-chain of the substrate.
Evolution of the enzyme may then increase the size of the active-site cavity of the resulting
mutated enzyme that is complementary to this region of the substrate.

The wild-type transketolase enzyme is first assessed for activity towards the set of new
substrates S1 to S3 (Figure 1). The sequence of wild-type TK is given at GenBank
accession no. NP_417410 and GI:16130836 from E. coli K12. Ref: Sprenger GA (1991).
Transketolase from E. coli K-12 - DNA-sequence of the gene and purification of the 
enzyme from recombinant strains. Biol Chem. 372(9): 759-759. A suitable assay uses 
β-hydroxypyruvate and glycolaldehyde as the ketol and aldehyde substrates respectively
(Figure 3). The change in absorbance of Cresol red indicator is monitored as the pH
increases due to consumption of the acidic β-hydroxypyruvate substrate. All of the new
substrates contain this acidic carboxylate group as it is necessary for the reaction to proceed
irreversibly. An HPLC assay for TK has been produced (C. Ingram, P. Dalby, G. Lye,
unpublished) and may be used to design a chiral HPLC assay for verifying the product 
enantioselectivity. A suitable chiral HPLC system uses Chromtech columns. 

Two libraries of transketolase mutants in E. coli BL21 DE3 (O. Miller, G. Lye & P. Dalby,
unpublished) are produced, one by focussed epPCR, the other using the XL1-Red
mutator strain (Stratagene), and maintained as glycerol stocks in 96-well deep-well plates at
-70°C. For screening, the glycerol stock plates are replicated into another deep-well plate
with 0.5 ml LB medium containing 50 µg/ml of ampicillin, and incubated overnight at 37°C
with shaking. The library is assessed in 96-well microplates for activity towards all three
new substrates S1, S2 and S3 (Figure 1) using an absorbance microplate reader and the
assay shown in Figs. 2 and 3. For the assay, each micro-well contains 150 µl of lysed (1
cycle of freeze-thaw using a -70°C freezer) culture; 50 µl of 60 mg/l Cresol red pH-indicator
solution; 50 µl of 300 mM glycolaldehyde solution; and 50 µl of 600 mM ketol donor
solution (substrates S1, S2, S3, etc.). The reaction is followed by absorbance at 560 nm using a
Fluostar Optima (BMG Labtechnologies) plate reader for at least 2 hours. This allows us
to identify new enzyme variants capable of activity towards S1, and simultaneously assess
whether activity can be obtained towards S2 or S3 in a single round of error-prone PCR
(average of one mutation per gene). This will characterise the extent of substrate
structural-space that can be explored in a single round of forced evolution. The enzyme
variants that show activity on the substrate with the greatest modification (S3>S2>S1) are
then isolated for further rounds of evolution and characterisation by sequencing. If activity
is obtained directly on S3 in the first round, further commercial substrates can be chosen
with even greater modifications as a new S2 and S3 (e.g. benzeneglyoxylic acid or
imidazolepyruvic acid).
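The final concentrations in each assay well follow from the stated volumes and stock concentrations by simple dilution. The short calculation below is a sketch that assumes the volumes are additive and that the lysed culture contributes none of the three reagents; the variable names are illustrative only.

```python
# Per-well volume (µl), stock concentration and unit, as stated in the text.
# The lysed culture has no stock concentration to track.
components = {
    "lysed culture": (150.0, None, None),
    "Cresol red": (50.0, 60.0, "mg/l"),
    "glycolaldehyde": (50.0, 300.0, "mM"),
    "ketol donor": (50.0, 600.0, "mM"),
}

# Total well volume, assuming the volumes are additive.
total_volume = sum(volume for volume, _, _ in components.values())

# Final concentration = stock concentration x (added volume / total volume).
final = {
    name: stock * volume / total_volume
    for name, (volume, stock, _) in components.items()
    if stock is not None
}

for name, value in final.items():
    unit = components[name][2]
    print(f"{name}: {value:g} {unit} in {total_volume:g} µl")
```

Under these assumptions each well holds 300 µl, with Cresol red at 10 mg/l, glycolaldehyde at 50 mM and the ketol donor at 100 mM.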

The best mutant(s) isolated above are then subjected to a second round of forced evolution
using DNA shuffling or error-prone PCR and similarly assessed for activity towards S0,
S1, S2 and S3. This characterises the extent to which the substrate specificity is
broadening or narrowing, and also the extent to which activity towards a new substrate (S2 
or S3) is being obtained. The best mutants are then isolated and characterised. 

This iterative process is then repeated until activity towards S3 can be enhanced no further
in terms of high activity and selectivity. The approach outlined, whereby activity towards
all substrates is continually assessed, allows the identification of the optimum process for
evolving specificity towards the final substrate, for example whether evolution towards
each new substrate can be attempted in successive rounds, or whether some enhancement
must be obtained for each substrate before proceeding to the next one.


The products of each reaction for the best mutants obtained can also be confirmed at each 
round of evolution using mass-spectrometry. The sequences of the obtained enzymes can 
be used to rationalise mutations with the observed changes in substrate specificity, where 
possible, by comparison with the wild-type transketolase structure. 

Although methods of the invention have been described with reference to transketolase,
which has great potential in the synthesis of asymmetric C-C bonds with a new chiral
centre, especially for novel sugar-based pharmaceuticals such as HIV protease inhibitors,
and anti-viral, anti-rheumatoid arthritis, and anti-tumour compounds, the method of the
invention can be readily applied to increasing the synthetic repertoire of other enzymes,
whether transketolases or from the other classes of enzymes mentioned above.






This new approach will allow the substrate specificity of an active molecule, e.g. an
enzyme, to be evolved beyond the presently perceived limitations. Forced evolution can
currently be used to improve upon the activity of an existing active molecule, e.g. an enzyme,
only where detectable activity on a new substrate can be introduced within the first
round of mutagenesis. The new substrates in these cases are only slightly modified: more
radical changes in substrate specificity are currently not directly possible.

The method of the invention can be used to modify an active molecule, e.g. an enzyme, 
such that it can be evolved to accept much more substantially altered substrates and hence 
greatly extend its range of biosynthetic reactions. An efficient process, such as that
outlined in this proposal, for obtaining activity towards non-natural pharmaceutical
intermediates would, therefore, be highly desirable, especially as biocatalyst discovery is
the first and most critical step in determining the feasibility of a bioconversion. Those
skilled in the art of biochemical engineering have the capacity to take such enhanced 
biocatalysts to process studies and scale-up. 


Industrial Application 

Methods in accordance with the invention may be used to develop active molecules, e.g.
enzymes, for a desired reaction, e.g. for use in the industrial production of chemicals. In
particular, the invention will allow us to make full use of the thousands of enzyme
chemistries that exist in nature, by increasing their ability to accept a wider range of novel
substrates.

All documents cited above are incorporated herein by reference.

Claims

1. A method of producing new active molecules, the method comprising:

i) a first round comprising mutating a nucleic acid encoding a starting active 
molecule, which has activity against a starting substrate, to produce one or more second 
active molecules and detecting activity of the one or more second active molecules against 
a second substrate on which the starting active molecule has substantially no activity, and 
selecting one or more second active molecules which have activity against the second 
substrate; and 

ii) a subsequent round comprising mutating a nucleic acid encoding the one or more 
selected second active molecules to produce one or more third active molecules and 
detecting the activity of one or more third active molecules against a third substrate
on which the second active molecule has substantially no activity, and selecting one or
more third active molecules which have activity against the third substrate,

wherein the third substrate is sufficiently different in structure from the starting substrate so
that the starting active molecule will have no activity against the third substrate and it 
would be substantially impossible to obtain an active molecule having activity against the 
third substrate by performing a single round of random mutagenesis on the starting active 
molecule. 

2. A method of producing new active molecules, the method comprising: 

i) a first round comprising mutating a nucleic acid encoding a starting active 
molecule, which has activity against a starting substrate, to produce one or more second 
active molecules and detecting activity of the one or more second active molecules against 
a second substrate on which the starting active molecule has substantially no activity, and 
selecting one or more second active molecules which have activity against the second 
substrate; and 

ii) a subsequent round comprising mutating a nucleic acid encoding the one or more 
selected second active molecules to produce one or more third active molecules and 
detecting the activity of one or more third active molecules against a third substrate
on which the second active molecule has substantially no activity, and selecting one or
more third active molecules which have activity against the third substrate,

in which the mutation steps in one or more such rounds are conducted by: 

(a) obtaining nucleic acid primers which flank an active site within a parent nucleic 
acid sequence encoding a parent active molecule; 

(b) carrying out a polymerase chain reaction (PCR) using said primers and the parent 
nucleic acid sequence as a template under suitable conditions for introducing mutations 
into the amplified active site sequence; 

(c) isolating said mutated active site; 

(d) introducing said mutated active site into the parent nucleic acid sequence to replace 
the non-mutated active site thereby producing a modified nucleic acid sequence, or 
introducing said mutated active site into a template nucleic acid sequence to produce a 
modified nucleic acid; and 

(f) expressing said modified nucleic acid sequence to produce a modified active 
molecule, 

wherein the third substrate is sufficiently different in structure from the starting substrate so 
that the starting active molecule will have no activity against the third substrate and it 
would be substantially impossible to obtain an active molecule having activity against the 
third substrate by performing a single round of random mutagenesis on the starting active 
molecule. 

3. The method according to claim 1 or 2 in which the method involves at least one further 
subsequent round. 

4. The method of claim 1, 2 or 3 in which the or each active molecule is an enzyme.

5. A method according to claim 4 in which a finally selected enzyme is further modified 
to improve its properties.

6. A method according to claim 5 in which the properties are selected from thermal
stability and catalytic activity. 

7. A method according to any preceding claim in which the reaction of interest involves 
the isolation of optical isomers of a compound. 


8. A method according to any preceding claim in which the starting active molecule is a 
transketolase. 

9. A method according to any preceding claim which is arranged to run in an automated 
format. 


10. The method according to any one of the preceding claims, wherein each substrate has
one or more minor differences from the previous substrate. 

11. An active molecule obtained or obtainable by a method according to any one of the
preceding claims. 

Continuation of Box I.2
Claims Nos.: 11 



Present claims 1-11 relate to an extremely large number of possible 
methods (claims 1-10) and compounds/products (claim 11). In fact, the 
claims contain so many options, variables, possible permutations and 
provisos that a lack of clarity and conciseness within the meaning of 
Article 6 PCT arises to such an extent as to render a meaningful search 
of the claims impossible. 

In addition, claims 1-10 relate to methods defined by reference to a 
desirable characteristic or property, namely by referring to a "third
substrate" which is "sufficiently different in structure from the
starting substrate so that the starting active molecule will have no
activity against the third substrate and it would be substantially
impossible to obtain an active molecule having an activity against the
third substrate by performing a single round of random mutagenesis on the
starting active molecule".

The claims cover all methods having this characteristic or property, 
whereas the application provides support within the meaning of Article 6 
PCT and/or disclosure within the meaning of Article 5 PCT for only a very 
limited number of such methods. In the present case, the claims so lack 
support, and the application so lacks disclosure, that a meaningful 
search over the whole of the claimed scope is impossible. Independent of 
the above reasoning, the claims also lack clarity (Article 6 PCT). An 
attempt is made to define the method by reference to a result to be 
achieved. Again, this lack of clarity in the present case is such as to 
render a meaningful search over the whole of the claimed scope 
impossible. 

Although it appears that no parts of the claims are clear, supported and 
disclosed, the search has been carried out for those parts relating to 
methods wherein the starting active molecule is a transketolase and the 
substrates and detection method are as indicated in the description and 
figures. In addition, the general concept of "directed evolution" by 
consecutive rounds of mutagenesis and selection using different 
substrates was searched. A meaningful search over the whole of the 
claimed scope is impossible.

In addition, the compounds/products in claim 11 are defined in terms of a 
result to be achieved and defined by reference to a desirable 
characteristic or property, namely their ability to be obtained by a 
method which is itself not clear and not sufficiently disclosed (see 
above). Thus, claim 11 has not been searched. 

The applicant's attention is drawn to the fact that claims, or parts of 
claims, relating to inventions in respect of which no international 
search report has been established need not be the subject of an 
international preliminary examination (Rule 66.1(e) PCT). The applicant 
is advised that the EPO policy when acting as an International 
Preliminary Examining Authority is normally not to carry out a
preliminary examination on matter which has not been searched. This is
the case irrespective of whether or not the claims are amended following
receipt of the search report or during any Chapter II procedure.

Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:
1. Claims Nos.: